---
title: "Memory"
description: ""
icon: "brain"
---

## Overview

An agent's memory is its ability to store, recall, and use information from past interactions. This lets the agent maintain context over time, improve its responses based on previous exchanges, and provide a more personalized experience.

The BeeAI framework provides several memory implementations:

| Type | Description |
|------|-------------|
| [**UnconstrainedMemory**](#unconstrainedmemory) | Unlimited storage for all messages |
| [**SlidingMemory**](#slidingmemory) | Keeps only the most recent k entries |
| [**TokenMemory**](#tokenmemory) | Manages token usage to stay within model context limits |
| [**SummarizeMemory**](#summarizememory) | Maintains a single summary of the conversation |

<Note>
Supported in Python and TypeScript.
</Note>

---

## Core concepts

### Messages

Messages are the fundamental units stored in memory, representing interactions between users and agents:
- Each message has a role (USER, ASSISTANT, SYSTEM)
- Messages contain text content
- Messages can be added, retrieved, and processed

### Memory types

Different memory strategies are available depending on your requirements:
- **Unconstrained** - Store unlimited messages
- **Sliding Window** - Keep only the most recent k messages
- **Token-based** - Manage a token budget to stay within model context limits
- **Summarization** - Compress previous interactions into summaries

### Integration points

Memory components integrate with other parts of the framework:
- LLMs use memory to maintain conversation context
- Agents access memory to process and respond to interactions
- Workflows can share memory between different processing steps

---

## Basic usage

### Capabilities showcase

<CodeGroup>

{/* <!-- embedme python/examples/memory/base.py --> */}
```py Python [expandable]
import asyncio
import sys
import traceback

from beeai_framework.backend import AssistantMessage, SystemMessage, UserMessage
from beeai_framework.errors import FrameworkError
from beeai_framework.memory import UnconstrainedMemory


async def main() -> None:
    memory = UnconstrainedMemory()

    # Single Message
    await memory.add(SystemMessage("You are a helpful assistant"))

    # Multiple Messages
    await memory.add_many([UserMessage("What can you do?"), AssistantMessage("Everything!")])

    print(memory.is_empty())  # False
    for message in memory.messages:  # prints the text of all messages
        print(message.text)
    print(memory.as_read_only())  # returns a new read only instance
    memory.reset()  # removes all messages


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except FrameworkError as e:
        traceback.print_exc()
        sys.exit(e.explain())

```

{/* <!-- embedme typescript/examples/memory/base.ts --> */}
```ts TypeScript [expandable]
import { UnconstrainedMemory } from "beeai-framework/memory/unconstrainedMemory";
import { AssistantMessage, SystemMessage, UserMessage } from "beeai-framework/backend/message";

const memory = new UnconstrainedMemory();

// Single message
await memory.add(new SystemMessage(`You are a helpful assistant.`));

// Multiple messages
await memory.addMany([new UserMessage(`What can you do?`), new AssistantMessage(`Everything!`)]);

console.info(memory.isEmpty()); // false
console.info(memory.messages); // prints all saved messages
console.info(memory.asReadOnly()); // returns a NEW read only instance
memory.reset(); // removes all messages

```

</CodeGroup>

### Usage with LLMs

<CodeGroup>

{/* <!-- embedme python/examples/memory/llm_memory.py --> */}
```py Python [expandable]
import asyncio
import sys
import traceback

from beeai_framework.adapters.ollama import OllamaChatModel
from beeai_framework.backend import AssistantMessage, SystemMessage, UserMessage
from beeai_framework.errors import FrameworkError
from beeai_framework.memory import UnconstrainedMemory


async def main() -> None:
    memory = UnconstrainedMemory()
    await memory.add_many(
        [
            SystemMessage("Always respond very concisely."),
            UserMessage("Give me the first 5 prime numbers."),
        ]
    )

    llm = OllamaChatModel("llama3.1")
    response = await llm.run(memory.messages)
    await memory.add(AssistantMessage(response.get_text_content()))

    print("Conversation history")
    for message in memory.messages:
        print(f"{message.role}: {message.text}")


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except FrameworkError as e:
        traceback.print_exc()
        sys.exit(e.explain())

```

{/* <!-- embedme typescript/examples/memory/llmMemory.ts --> */}
```ts TypeScript [expandable]
import { UnconstrainedMemory } from "beeai-framework/memory/unconstrainedMemory";
import { Message } from "beeai-framework/backend/message";
import { OllamaChatModel } from "beeai-framework/adapters/ollama/backend/chat";

const memory = new UnconstrainedMemory();
await memory.addMany([
  Message.of({
    role: "system",
    text: `Always respond very concisely.`,
  }),
  Message.of({ role: "user", text: `Give me the first 5 prime numbers.` }),
]);

// Generate response
const llm = new OllamaChatModel("llama3.1");
const response = await llm.create({ messages: memory.messages });
await memory.add(Message.of({ role: "assistant", text: response.getTextContent() }));

console.log(`Conversation history`);
for (const message of memory) {
  console.log(`${message.role}: ${message.text}`);
}

```

</CodeGroup>


<Tip>
Memory for non-chat LLMs works exactly the same way.
</Tip>

### Usage with agents

<CodeGroup>

{/* <!-- embedme python/examples/memory/agent_memory.py --> */}
```py Python [expandable]
import asyncio
import sys
import traceback

from beeai_framework.agents.react import ReActAgent
from beeai_framework.backend import AssistantMessage, ChatModel, UserMessage
from beeai_framework.errors import FrameworkError
from beeai_framework.memory import UnconstrainedMemory

# Initialize the memory and LLM
memory = UnconstrainedMemory()


def create_agent() -> ReActAgent:
    llm = ChatModel.from_name("ollama:granite4:micro")

    # Initialize the agent
    agent = ReActAgent(llm=llm, memory=memory, tools=[])

    return agent


async def main() -> None:
    # Create user message
    user_input = "Hello world!"
    user_message = UserMessage(user_input)

    # Await adding user message to memory
    await memory.add(user_message)
    print("Added user message to memory")

    # Create agent
    agent = create_agent()

    response = await agent.run(
        user_input,
        max_retries_per_step=3,
        total_max_retries=10,
        max_iterations=20,
    )
    print(f"Received response: {response}")

    # Create and store assistant's response
    assistant_message = AssistantMessage(response.last_message.text)

    # Await adding assistant message to memory
    await memory.add(assistant_message)
    print("Added assistant message to memory")

    # Print results
    print(f"\nMessages in memory: {len(agent.memory.messages)}")

    if len(agent.memory.messages) >= 1:
        user_msg = agent.memory.messages[0]
        print(f"User: {user_msg.text}")

    if len(agent.memory.messages) >= 2:
        agent_msg = agent.memory.messages[1]
        print(f"Agent: {agent_msg.text}")
    else:
        print("No agent message found in memory")


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except FrameworkError as e:
        traceback.print_exc()
        sys.exit(e.explain())

```

{/* <!-- embedme typescript/examples/memory/agentMemory.ts --> */}
```ts TypeScript [expandable]
import { UnconstrainedMemory } from "beeai-framework/memory/unconstrainedMemory";
import { ReActAgent } from "beeai-framework/agents/react/agent";
import { OllamaChatModel } from "beeai-framework/adapters/ollama/backend/chat";

const agent = new ReActAgent({
  memory: new UnconstrainedMemory(),
  llm: new OllamaChatModel("llama3.1"),
  tools: [],
});
await agent.run({ prompt: "Hello world!" });

console.info(agent.memory.messages.length); // 2

const userMessage = agent.memory.messages[0];
console.info(`User: ${userMessage.text}`); // User: Hello world!

const agentMessage = agent.memory.messages[1];
console.info(`Agent: ${agentMessage.text}`); // Agent: Hello! It's nice to chat with you.

```
</CodeGroup>

<Tip>
If your memory already contains the user message, run the agent with `prompt: null`.
</Tip>

<Note>
ReAct Agent internally uses `TokenMemory` to store intermediate steps for a given run.
</Note>

---

## Memory types

The framework provides multiple out-of-the-box memory implementations for different use cases.

### UnconstrainedMemory

Unlimited in size. Stores every message without constraints.

<CodeGroup>
{/* <!-- embedme python/examples/memory/unconstrained_memory.py --> */}
```py Python [expandable]
import asyncio
import sys
import traceback

from beeai_framework.backend import UserMessage
from beeai_framework.errors import FrameworkError
from beeai_framework.memory import UnconstrainedMemory


async def main() -> None:
    # Create memory instance
    memory = UnconstrainedMemory()

    # Add a message
    await memory.add(UserMessage("Hello world!"))

    # Print results
    print(f"Is Empty: {memory.is_empty()}")  # Should print: False
    print(f"Message Count: {len(memory.messages)}")  # Should print: 1

    print("\nMessages:")
    for msg in memory.messages:
        print(f"{msg.role}: {msg.text}")


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except FrameworkError as e:
        traceback.print_exc()
        sys.exit(e.explain())

```

{/* <!-- embedme typescript/examples/memory/unconstrainedMemory.ts --> */}
```ts TypeScript [expandable]
import { UnconstrainedMemory } from "beeai-framework/memory/unconstrainedMemory";
import { Message } from "beeai-framework/backend/message";

const memory = new UnconstrainedMemory();
await memory.add(
  Message.of({
    role: "user",
    text: `Hello world!`,
  }),
);

console.info(memory.isEmpty()); // false
console.log(memory.messages.length); // 1
console.log(memory.messages);

```
</CodeGroup>


### SlidingMemory

Keeps the last `k` entries in memory. When the limit is exceeded, the oldest entry is removed unless a custom removal selector specifies otherwise.

<CodeGroup>

{/* <!-- embedme python/examples/memory/sliding_memory.py --> */}
```py Python [expandable]
import asyncio
import sys
import traceback

from beeai_framework.backend import AssistantMessage, SystemMessage, UserMessage
from beeai_framework.errors import FrameworkError
from beeai_framework.memory import SlidingMemory, SlidingMemoryConfig


async def main() -> None:
    # Create sliding memory with size 3
    memory = SlidingMemory(
        SlidingMemoryConfig(
            size=3,
            handlers={"removal_selector": lambda messages: messages[0]},  # Remove oldest message
        )
    )

    # Add messages
    await memory.add(SystemMessage("You are a helpful assistant."))

    await memory.add(UserMessage("What is Python?"))

    await memory.add(AssistantMessage("Python is a programming language."))

    # Adding a fourth message should trigger sliding window
    await memory.add(UserMessage("What about JavaScript?"))

    # Print results
    print(f"Messages in memory: {len(memory.messages)}")  # Should print 3
    for msg in memory.messages:
        print(f"{msg.role}: {msg.text}")


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except FrameworkError as e:
        traceback.print_exc()
        sys.exit(e.explain())

```
{/* <!-- embedme typescript/examples/memory/slidingMemory.ts --> */}
```ts TypeScript [expandable]
import { SlidingMemory } from "beeai-framework/memory/slidingMemory";
import { Message } from "beeai-framework/backend/message";

const memory = new SlidingMemory({
  size: 3, // (required) number of messages that can be in the memory at a single moment
  handlers: {
    // optional
    // select the first non-system message (the default behaviour is to select the oldest one)
    removalSelector: (messages) => messages.find((msg) => msg.role !== "system")!,
  },
});

await memory.add(Message.of({ role: "system", text: "You are a guide through France." }));
await memory.add(Message.of({ role: "user", text: "What is the capital?" }));
await memory.add(Message.of({ role: "assistant", text: "Paris" }));
await memory.add(Message.of({ role: "user", text: "What language is spoken there?" })); // removes the first user's message
await memory.add(Message.of({ role: "assistant", text: "French" })); // removes the first assistant's message

console.info(memory.isEmpty()); // false
console.log(memory.messages.length); // 3
console.log(memory.messages);

```
</CodeGroup>
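
The eviction behaviour can also be sketched in plain Python, independent of the framework. This is an illustration under assumptions, not the `SlidingMemory` implementation: `Msg`, `SlidingWindow`, and `first_non_system` are hypothetical names, and the sketch assumes that messages chosen by the removal selector are evicted before a new message is appended.

```python
from __future__ import annotations

from collections.abc import Callable
from typing import NamedTuple


class Msg(NamedTuple):
    """Hypothetical stand-in for a framework message."""

    role: str
    text: str


class SlidingWindow:
    """Illustrative fixed-size window; not the framework's SlidingMemory."""

    def __init__(
        self,
        size: int,
        removal_selector: Callable[[list[Msg]], Msg] | None = None,
    ) -> None:
        if size < 1:
            raise ValueError("size must be at least 1")
        self.size = size
        # Default policy (assumed): evict the oldest message.
        self.removal_selector = removal_selector or (lambda messages: messages[0])
        self.messages: list[Msg] = []

    def add(self, message: Msg) -> None:
        # Make room first, so the window never holds more than `size` messages.
        while len(self.messages) >= self.size:
            self.messages.remove(self.removal_selector(self.messages))
        self.messages.append(message)


def first_non_system(messages: list[Msg]) -> Msg:
    # Preserve system prompts; fall back to the oldest message if nothing else is left.
    return next((m for m in messages if m.role != "system"), messages[0])


window = SlidingWindow(size=3, removal_selector=first_non_system)
for msg in [
    Msg("system", "You are a guide through France."),
    Msg("user", "What is the capital?"),
    Msg("assistant", "Paris"),
    Msg("user", "What language is spoken there?"),
    Msg("assistant", "French"),
]:
    window.add(msg)

print([m.text for m in window.messages])
```

With this selector the system message survives both evictions, which is usually what you want when the system prompt carries the agent's instructions.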


### TokenMemory

Ensures that the total token count of all messages stays below the given threshold.
If the threshold is exceeded, the oldest message is removed by default.

<CodeGroup>

{/* <!-- embedme python/examples/memory/token_memory.py --> */}
```py Python [expandable]
import asyncio
import math
import sys
import traceback

from beeai_framework.adapters.ollama import OllamaChatModel
from beeai_framework.backend import Role, SystemMessage, UserMessage
from beeai_framework.errors import FrameworkError
from beeai_framework.memory import TokenMemory

# Initialize the LLM
llm = OllamaChatModel()

# Initialize TokenMemory with handlers
memory = TokenMemory(
    llm=llm,
    max_tokens=None,  # Will be inferred from LLM
    capacity_threshold=0.75,
    sync_threshold=0.25,
    handlers={
        "removal_selector": lambda messages: next((msg for msg in messages if msg.role != Role.SYSTEM), messages[0]),
        "estimate": lambda msg: math.ceil((len(msg.role) + len(msg.text)) / 4),
    },
)


async def main() -> None:
    # Add system message
    system_message = SystemMessage("You are a helpful assistant.")
    await memory.add(system_message)
    print(f"Added system message (hash: {hash(system_message)})")

    # Add user message
    user_message = UserMessage("Hello world!")
    await memory.add(user_message)
    print(f"Added user message (hash: {hash(user_message)})")

    # Check initial memory state
    print("\nInitial state:")
    print(f"Is Dirty: {memory.is_dirty}")
    print(f"Tokens Used: {memory.tokens_used}")

    # Sync token counts
    await memory.sync()
    print("\nAfter sync:")
    print(f"Is Dirty: {memory.is_dirty}")
    print(f"Tokens Used: {memory.tokens_used}")

    # Print all messages
    print("\nMessages in memory:")
    for msg in memory.messages:
        print(f"{msg.role}: {msg.text} (hash: {hash(msg)})")


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except FrameworkError as e:
        traceback.print_exc()
        sys.exit(e.explain())

```
{/* <!-- embedme typescript/examples/memory/tokenMemory.ts --> */}
```ts TypeScript [expandable]
import { TokenMemory } from "beeai-framework/memory/tokenMemory";
import { Message } from "beeai-framework/backend/message";

const memory = new TokenMemory({
  maxTokens: undefined, // optional (default is 128k)
  capacityThreshold: 0.75, // maxTokens*capacityThreshold = threshold where we start removing old messages
  syncThreshold: 0.25, // maxTokens*syncThreshold = threshold where we start to use a real tokenization endpoint instead of guessing the number of tokens
  handlers: {
    // optional way to define which message should be deleted (default is the oldest one)
    removalSelector: (messages) => messages.find((msg) => msg.role !== "system")!,

    // optional way to estimate the number of tokens in a message before we use the actual tokenize endpoint (number of tokens < maxTokens*syncThreshold)
    estimate: (msg) => Math.ceil((msg.role.length + msg.text.length) / 4),
  },
});

await memory.add(Message.of({ role: "system", text: "You are a helpful assistant." }));
await memory.add(Message.of({ role: "user", text: "Hello world!" }));

console.info(memory.isDirty); // is the consumed token count estimated or retrieved via the tokenize endpoint?
console.log(memory.tokensUsed); // number of used tokens
console.log(memory.stats()); // prints statistics
await memory.sync(); // calculates real token usage for all messages marked as "dirty"

```

</CodeGroup>
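
To make the interplay between the token limit, the capacity threshold, and the `estimate` handler concrete, here is a framework-free sketch. It is a simplification under assumptions: `estimate_tokens` and `trim_to_budget` are hypothetical helpers, the exact comparison used by `TokenMemory` may differ, and synchronisation with a real tokenizer is left out entirely.

```python
import math


def estimate_tokens(role: str, text: str) -> int:
    # Same rough heuristic as the `estimate` handler above: about 4 characters per token.
    return math.ceil((len(role) + len(text)) / 4)


def trim_to_budget(
    messages: list[tuple[str, str]],
    max_tokens: int,
    capacity_threshold: float = 0.75,
) -> list[tuple[str, str]]:
    """Drop the oldest messages until the estimated total fits the budget."""
    budget = max_tokens * capacity_threshold
    kept = list(messages)
    while kept and sum(estimate_tokens(role, text) for role, text in kept) > budget:
        kept.pop(0)  # assumed default policy: remove the oldest message first
    return kept


history = [
    ("system", "You are a helpful assistant."),
    ("user", "Hello world!"),
    ("assistant", "Hi! How can I help you today?"),
]

# A deliberately tiny limit so that trimming is visible: budget = 24 * 0.75 = 18 tokens.
trimmed = trim_to_budget(history, max_tokens=24)
print(trimmed)
```

Because the default policy drops the oldest message, the system prompt is the first to go here. A custom removal selector, like the one in the examples above, avoids that.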

### SummarizeMemory

Only a single summary of the conversation is preserved. The summary is updated with every new message.

<CodeGroup>

{/* <!-- embedme python/examples/memory/summarize_memory.py --> */}
```py Python [expandable]
import asyncio
import sys
import traceback

from beeai_framework.backend import AssistantMessage, ChatModel, SystemMessage, UserMessage
from beeai_framework.errors import FrameworkError
from beeai_framework.memory import SummarizeMemory


async def main() -> None:
    # Initialize the LLM with parameters
    llm = ChatModel.from_name(
        "ollama:granite4:micro",
        # ChatModelParameters(temperature=0),
    )

    # Create summarize memory instance
    memory = SummarizeMemory(llm)

    # Add messages
    await memory.add_many(
        [
            SystemMessage("You are a guide through France."),
            UserMessage("What is the capital?"),
            AssistantMessage("Paris"),
            UserMessage("What language is spoken there?"),
        ]
    )

    # Print results
    print(f"Is Empty: {memory.is_empty()}")
    print(f"Message Count: {len(memory.messages)}")

    if memory.messages:
        print(f"Summary: {memory.messages[0].get_texts()[0].text}")


if __name__ == "__main__":
    try:
        asyncio.run(main())
    except FrameworkError as e:
        traceback.print_exc()
        sys.exit(e.explain())

```
{/* <!-- embedme typescript/examples/memory/summarizeMemory.ts --> */}
```ts TypeScript [expandable]
import { Message } from "beeai-framework/backend/message";
import { SummarizeMemory } from "beeai-framework/memory/summarizeMemory";
import { OllamaChatModel } from "beeai-framework/adapters/ollama/backend/chat";

const memory = new SummarizeMemory({
  llm: new OllamaChatModel("llama3.1"),
});

await memory.addMany([
  Message.of({ role: "system", text: "You are a guide through France." }),
  Message.of({ role: "user", text: "What is the capital?" }),
  Message.of({ role: "assistant", text: "Paris" }),
  Message.of({ role: "user", text: "What language is spoken there?" }),
]);

console.info(memory.isEmpty()); // false
console.log(memory.messages.length); // 1
console.log(memory.messages[0].text); // The capital city of France is Paris, ...

```
</CodeGroup>
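
The "single running summary" idea can be sketched without the framework or an LLM. Everything in this block is hypothetical: `RollingSummary` is not the `SummarizeMemory` class, and `naive_summarize` is a placeholder that truncates text where a real implementation would call a model.

```python
from collections.abc import Callable


class RollingSummary:
    """Illustrative only: keeps one running summary instead of the full history."""

    def __init__(self, summarize: Callable[[str, str], str]) -> None:
        self._summarize = summarize
        self.summary = ""

    def add(self, role: str, text: str) -> None:
        # Fold the new message into the existing summary, then discard the message itself.
        self.summary = self._summarize(self.summary, f"{role}: {text}")


def naive_summarize(previous: str, new_line: str, limit: int = 60) -> str:
    # Placeholder for an LLM call: join the lines and keep only the last `limit` characters.
    combined = f"{previous} | {new_line}" if previous else new_line
    return combined[-limit:]


memory = RollingSummary(naive_summarize)
for role, text in [
    ("user", "What is the capital?"),
    ("assistant", "Paris"),
    ("user", "What language is spoken there?"),
]:
    memory.add(role, text)

print(memory.summary)
```

The stored state stays bounded no matter how long the conversation runs, at the cost of losing detail from earlier messages.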

---

## Creating custom memory

To create your own memory implementation, extend the `BaseMemory` class and implement the methods shown below.

<CodeGroup>
{/* <!-- embedme python/examples/memory/custom.py --> */}
```py Python [expandable]
from typing import Any

from beeai_framework.backend import AnyMessage
from beeai_framework.memory import BaseMemory


class MyMemory(BaseMemory):
    @property
    def messages(self) -> list[AnyMessage]:
        raise NotImplementedError("Method not yet implemented.")

    async def add(self, message: AnyMessage, index: int | None = None) -> None:
        raise NotImplementedError("Method not yet implemented.")

    async def delete(self, message: AnyMessage) -> bool:
        raise NotImplementedError("Method not yet implemented.")

    def reset(self) -> None:
        raise NotImplementedError("Method not yet implemented.")

    def create_snapshot(self) -> Any:
        raise NotImplementedError("Method not yet implemented.")

    def load_snapshot(self, state: Any) -> None:
        raise NotImplementedError("Method not yet implemented.")

```

{/* <!-- embedme typescript/examples/memory/custom.ts --> */}
```ts TypeScript [expandable]
import { BaseMemory } from "beeai-framework/memory/base";
import { Message } from "beeai-framework/backend/message";
import { NotImplementedError } from "beeai-framework/errors";

export class MyMemory extends BaseMemory {
  get messages(): readonly Message[] {
    throw new NotImplementedError("Method not implemented.");
  }

  add(message: Message, index?: number): Promise<void> {
    throw new NotImplementedError("Method not implemented.");
  }

  delete(message: Message): Promise<boolean> {
    throw new NotImplementedError("Method not implemented.");
  }

  reset(): void {
    throw new NotImplementedError("Method not implemented.");
  }

  createSnapshot(): unknown {
    throw new NotImplementedError("Method not implemented.");
  }

  loadSnapshot(state: ReturnType<typeof this.createSnapshot>): void {
    throw new NotImplementedError("Method not implemented.");
  }
}

```

</CodeGroup>

<Tip>
The simplest implementation is `UnconstrainedMemory`.
</Tip>
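
As a rough guide to what those methods might do, here is a list-backed sketch that mirrors the same six members. It deliberately avoids importing the framework so that it runs on its own: `Msg` and `ListMemory` are hypothetical, and a real implementation would subclass `BaseMemory` and use the framework's message types instead.

```python
from __future__ import annotations

import asyncio
from typing import Any, NamedTuple


class Msg(NamedTuple):
    """Hypothetical stand-in for a framework message."""

    role: str
    text: str


class ListMemory:
    """Illustrative list-backed store with the same method names as the skeleton above."""

    def __init__(self) -> None:
        self._messages: list[Msg] = []

    @property
    def messages(self) -> list[Msg]:
        return self._messages

    async def add(self, message: Msg, index: int | None = None) -> None:
        # Append by default; insert at `index` when one is given.
        position = len(self._messages) if index is None else index
        self._messages.insert(position, message)

    async def delete(self, message: Msg) -> bool:
        # Report whether the message was actually present.
        try:
            self._messages.remove(message)
        except ValueError:
            return False
        return True

    def reset(self) -> None:
        self._messages.clear()

    def create_snapshot(self) -> Any:
        # Copy the list so later mutations do not leak into the snapshot.
        return {"messages": list(self._messages)}

    def load_snapshot(self, state: Any) -> None:
        self._messages = list(state["messages"])


async def demo() -> list[str]:
    memory = ListMemory()
    await memory.add(Msg("user", "Hello"))
    await memory.add(Msg("system", "Be brief."), index=0)
    snapshot = memory.create_snapshot()
    memory.reset()
    memory.load_snapshot(snapshot)
    return [m.role for m in memory.messages]


roles = asyncio.run(demo())
print(roles)
```

Copying the list in `create_snapshot` and `load_snapshot` keeps a snapshot independent of the live memory, so resetting one does not affect the other.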

---

## Examples

<CardGroup cols={2}>
  <Card title="Python" icon="python" href="https://github.com/i-am-bee/beeai-framework/tree/main/python/examples/memory">
    Explore reference memory implementations in Python
  </Card>
  <Card title="TypeScript" icon="js" href="https://github.com/i-am-bee/beeai-framework/tree/main/typescript/examples/memory">
    Explore reference memory implementations in TypeScript
  </Card>
</CardGroup>
